Federated Accelerated Stochastic Gradient Descent
We propose Federated Accelerated Stochastic Gradient Descent (FedAc), a principled acceleration of Federated Averaging (FedAvg, also known as Local SGD) for distributed optimization. FedAc is the first provable acceleration of FedAvg that improves convergence speed and communication efficiency on various types of convex functions. For example, for strongly convex and smooth functions, when using M workers, the previous state-of-the-art FedAvg analysis can achieve a linear speedup in M if given M rounds of synchronization, whereas FedAc only requires M^{1/3} rounds. Moreover, we prove stronger guarantees for FedAc when the objectives are third-order smooth. Our technique is based on a potential-based perturbed iterate analysis, a novel stability analysis of generalized accelerated SGD, and a strategic tradeoff between acceleration and stability.
Review for NeurIPS paper: Federated Accelerated Stochastic Gradient Descent
Summary and Contributions: The paper proposes a new version of the Local-SGD/Federated Averaging algorithm -- Federated Accelerated SGD (FedAc). The algorithm solves a smooth convex expectation-minimization problem in a distributed/federated fashion: M workers access stochastic gradients of the objective function in parallel and periodically communicate with a parameter server. FedAc is a combination of the AC-SA method from (Ghadimi and Lan, 2012) and Federated Averaging. The authors give the first analysis of this method for general strongly convex functions (in the convex case the method was analyzed in (Woodworth et al., 2020), but only for quadratic objectives) under the assumption that the variance of the stochastic gradients is uniformly bounded. The derived bounds outperform the state-of-the-art results for federated methods in this setting, and the rates are close to the accelerated ones. Moreover, the authors show how their bounds improve under the additional assumptions that the Hessian is Lipschitz continuous and the 4th central moment of the stochastic gradient is bounded, and they also extend known results for Local-SGD (FedAvg) to this setting.
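To make the distributed setup concrete, here is a minimal toy sketch of the baseline Local SGD / FedAvg scheme the review describes (M parallel workers taking local stochastic gradient steps with periodic server averaging). This is not the accelerated FedAc algorithm; the objective, step size, and synchronization interval below are illustrative choices, not taken from the paper.

```python
import numpy as np

def local_sgd(M=4, K=10, rounds=20, lr=0.1, dim=5, noise=0.1, seed=0):
    """Toy Local SGD / FedAvg on the strongly convex objective
    f(x) = 0.5 * ||x||^2 (an illustrative choice, not from the paper).

    M workers each run K local SGD steps on noisy gradients, then a
    parameter server averages their iterates and broadcasts the result.
    """
    rng = np.random.default_rng(seed)
    x = rng.normal(size=(M, dim))  # each row is one worker's local iterate
    for _ in range(rounds):
        for _ in range(K):
            # stochastic gradient of 0.5*||x||^2 is x plus zero-mean noise
            grads = x + noise * rng.normal(size=x.shape)
            x = x - lr * grads
        # synchronization round: server averages and broadcasts to all workers
        x = np.tile(x.mean(axis=0), (M, 1))
    return x.mean(axis=0)  # final averaged model

x_final = local_sgd()
```

Since the toy objective is minimized at the origin, the averaged iterate should end up near zero; FedAc's contribution, per the abstract, is achieving comparable speedups with far fewer of these synchronization rounds.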